Quasi-automatic extraction of tong large existing speech cineradiog
نویسنده
چکیده
Automatic analysis of tongue movement in large existing cineradiographic databases can provide valuable information to understood speech production. We describe here a method for semi-automatic extraction of articulatory information from video observation in order to derive quasi-automatically a geometrical parameterization of the vocal tract movements. The algorithm starts with a limited manual processing step consisting in marking 10 points (12 degrees of freedom) on 100 chosen key images. The treatment on the whole sequence is then automatic thanks to a retro-marking method. At first, the whole database is indexed via a similarity measure performed with the key images. Then, we associate on the original images the geometrical information recovered on the key images via this indexing. Different complementary error reduction methods are also proposed. Averaging geometrical configurations of a neighborhood, temporal filtering and spline interpolation allow to reduce the reconstruction error to about 10 pixels for a tongue contour of average length of 260 pixels.
منابع مشابه
Quasi-automatic extraction of tongue movement from a large existing speech cineradiographic database
Automatic analysis of tongue movement in large existing cineradiographic databases can provide valuable information to understood speech production. We describe here a method for semi-automatic extraction of articulatory information from video observation in order to derive quasi-automatically a geometrical parameterization of the vocal tract movements. The algorithm starts with a limited manua...
متن کاملTowards High Performance Phonotactic Feature for Spoken Language Recognition
With the demands of globalization, multilingual speech is increasingly common in conversational telephone speech, broadcast news and internet podcasts. Therefore, automatic spoken language recognition has become an important technology in multilingual speech related applications. For example, automatic spoken language recognition has been used as a preprocessing component for spoken language tr...
متن کاملDiscriminant Training of Front-End and Acoustic Modeling Stages to Heterogeneous Acoustic Environments for Multi-stream Automatic Speech Recognition
Automatic Speech Recognition (ASR) still poses a problem to researchers. In particular, most ASR systems have not been able to fully handle adverse acoustic environments. Although a large number of modi cations have resulted in increased levels of performance robustness, ASR systems still fall short of human recognition ability in a large number of environments. A possible shortcoming of the ty...
متن کامل1998 Hub-4 Information Extraction Evaluation
This paper documents the Information Extraction Named-Entity Evaluation (IE-NE), one of the new spokes added to the DARPA-sponsored 1998 Hub-4 Broadcast News Evaluation. This paper discusses the information extraction task as posed for the 1998 Broadcast News Evaluation. This paper reviews the evaluation metrics, the scoring process, and the test corpus that was used for the evaluation. Finally...
متن کاملAn elitist approach to automatic articulatory-acoustic feature classification for phonetic characterization of spoken language
A novel framework for automatic articulatory-acoustic feature extraction has been developed for enhancing the accuracy of placeand manner-of-articulation classification in spoken language. The ‘‘elitist’’ approach provides a principled means of selecting frames for which multi-layer perceptron, neural-network classifiers are highly confident. Using this method it is possible to achieve a frame-...
متن کامل